Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AWS] [S3] fix: improve object size metric calculation #41755

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Kavindu-Dodan
Copy link
Contributor

@Kavindu-Dodan Kavindu-Dodan commented Nov 22, 2024

Proposed commit message

This fixes a bug with Cloudflare logpush integration, which uses the S3 input.

#40775 introduced a feature introducing s3_object_size_in_bytes histogram. Internally it utilized S3 SDK's GetObjectOutput instances' ContentLength property 1. In code level (SDK internally), this is derived using Content-Length header 2.

It seems that Cloudflare R2 (API compatible with S3) could omit Content-Length header 3. Hence, I have improved how we derive the s3_object_size_in_bytes metric by avoiding using Content-Length backed method.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  • Build filebeat from this branch
  • Enable monitoring endpoints
  • Configure buckets with data and observe metrics in monitoring endpoint

Screenshots

See metric comparison with proposed collection method (before removing content length usage). Tracked total bytes are same as content length.

Handling validated for gzip

image

Handling validated for plain text

image


Footnotes

  1. https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html#API_GetObject_ResponseSyntax

  2. https://github.com/aws/aws-sdk-go-v2/blob/service/s3/v1.60.1/service/s3/deserializers.go#L5477-L5484

  3. https://community.cloudflare.com/t/no-content-length-header-when-serving-plaintext-from-r2/565521

@Kavindu-Dodan Kavindu-Dodan added bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify labels Nov 22, 2024
@Kavindu-Dodan Kavindu-Dodan requested a review from a team November 22, 2024 19:20
@Kavindu-Dodan Kavindu-Dodan requested a review from a team as a code owner November 22, 2024 19:20
@elasticmachine
Copy link
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Nov 22, 2024
Signed-off-by: Kavindu Dodanduwa <[email protected]>
Signed-off-by: Kavindu Dodanduwa <[email protected]>
@Kavindu-Dodan Kavindu-Dodan force-pushed the fix/improve-object-size-metric-calculation branch from 78c0c74 to 04531f3 Compare November 22, 2024 19:47
@@ -36,7 +36,7 @@ func init() {
// currentTime returns the current time. This exists to allow unit tests
// simulate the passage of time.
func currentTime() time.Time {
clock := clockValue.Load().(clock)
clock, _ := clockValue.Load().(clock)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - had to fix this to get lint correct

Comment on lines 212 to +214
totalBytesRead *monitoring.Uint

trackBytes int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the naming could be more clear. "trackBytes" does not mean much to me. Maybe...

	totalBytesReadMetric *monitoring.Uint
	totalBytesRead       int64

return n, err
}

func (m *monitoredReader) GetTrackedBytes() int64 {
Copy link
Member

@andrewkroh andrewkroh Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The accessor seems unnecessary. The caller can read straight from the field.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants